feat: add package metadata for wheel libraries #3531
stevebarrau wants to merge 2 commits into bazel-contrib:main
Add package_metadata rule to generated BUILD files for wheel libraries to track package provenance using PURL (Package URL) format.
python/private/pypi/whl_library.bzl
Outdated
entry_points[entry_point_without_py] = entry_point_script_name

namespace_package_files = pypi_repo_utils.find_namespace_package_files(rctx, rctx.path("site-packages"))
purl = "pkg:pypi/{}@{}".format(
What happens if we have private packages in here? Does pkg:pypi/{}@{} denote only the type of registry, or the actual public PyPI?
IIUC, registry information is not yet present in the metadata object, right? The most common approach is to have an --index-url option in the requirements_lock file. Do you know where I can fish this information out from within the rules?
Yes, the --index-url may be present in the extra_pip_args attribute of whl_library, but if we know the URL to download the wheel from, we don't pass extra_pip_args.
Given that people sometimes choose to use a fall-through cache mirror (Artifactory, et al.) for public PyPI packages, I feel the purl should still point to the public PyPI, and only things not found on PyPI should point to a particular registry.
What is the spec here that you are implementing against?
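To make the suggestion above concrete, here is a minimal sketch of a purl builder that points at public PyPI by default and adds a registry only when the caller supplies one. It is not the code in this PR: the helper name `to_purl` and its `repository_url` parameter are hypothetical, and deciding whether a package is private is left to the caller. The name normalization and the `repository_url` qualifier follow the purl type definition for pypi as I understand it. Percent-encoding of the version and qualifier value is not handled. The snippet sticks to the Python subset that should also be valid Starlark.

```python
# Hypothetical helper for illustration; not part of rules_python.
def to_purl(name, version, repository_url = None):
    """Builds a pkg:pypi purl, adding a registry only when one is given."""

    # The pypi purl type expects a lowercased name with "_" replaced by "-".
    normalized = name.lower().replace("_", "-")
    purl = "pkg:pypi/{}@{}".format(normalized, version)

    # Assumption: the caller passes repository_url only for packages that
    # are not available on public PyPI (mirrors of PyPI pass nothing).
    if repository_url:
        # NOTE: percent-encoding of the qualifier value is omitted here.
        purl += "?repository_url={}".format(repository_url)
    return purl
```

With this shape, a package fetched through a caching mirror still gets the same purl as one fetched from PyPI directly, which keeps SBOM entries comparable across build environments.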
python/private/pypi/whl_library.bzl
Outdated
"pypi_version={}".format(metadata["version"]),
],
namespace_package_files = namespace_package_files,
purl = purl,
This also needs to be added above, in the other branch of the code.
What other things are needed for #2054 to be resolved?
bazel_dep(name = "bazel_features", version = "1.21.0")
bazel_dep(name = "bazel_skylib", version = "1.8.2")
bazel_dep(name = "rules_cc", version = "0.1.5")
bazel_dep(name = "package_metadata", version = "0.0.6")
package_metadata(
    name = "package_metadata",
    purl = {purl},
    visibility = ["//:__subpackages__"],
Why limited visibility when the rest is public?
I followed the Go implementation in the gazelle repo: https://github.com/bazel-contrib/bazel-gazelle/blob/7b9d7f36a9278df5637cdb82f660c46b8df2aae4/internal/go_repository.bzl#L432-L438
Ah... but I wonder if they did it right :-)
Usually, tools pick up the metadata by aspect traversal, so visibility is not needed.
But... we are building out an override system that lets you apply alternate metadata for a package. People will certainly do things like splice in license and copyright declarations when they have researched them and the wheel doesn't contain them. I could make a case that those overrides should point to the label of the thing the aspect will find. But we could also match on the string form of the target.
@Yannic: you did the Go thing. Why limited visibility?
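For reference while the visibility question is open, the following sketch renders the `package_metadata` stanza with visibility as a parameter, so either choice can be generated from the same template. The template constant and the `render_package_metadata` function are hypothetical names modelled on the fragment quoted in this thread, and the sketch assumes the `{purl}` placeholder is filled with a quoted string literal via `repr()`. It takes no position on which visibility is correct. The code is limited to the Python subset that should also be valid Starlark.

```python
# Hypothetical template, modelled on the BUILD fragment quoted above.
_PACKAGE_METADATA_TMPL = """package_metadata(
    name = "package_metadata",
    purl = {purl},
    visibility = {visibility},
)
"""

def render_package_metadata(purl, visibility = None):
    """Renders the stanza; visibility defaults to public in this sketch."""
    if visibility == None:
        # Assumption: public by default, matching the other generated targets.
        visibility = ["//visibility:public"]

    # repr() turns the values into quoted literals that a BUILD file accepts.
    return _PACKAGE_METADATA_TMPL.format(
        purl = repr(purl),
        visibility = repr(visibility),
    )
```

Passing `["//:__subpackages__"]` as the second argument reproduces the restricted form used in the diff.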
logger = logger,
)

def _to_purl(metadata):
What if it's a wheel library from outside of PyPI? I've worked on this before and took this approach: https://github.com/bazel-contrib-supply-chain-extras/rules_python/pull/1/changes
Add package_metadata rule to generated BUILD files for wheel libraries to track package provenance using PURL (Package URL) format.
This is then picked up by supply_chain_tools to produce an SBOM for Python targets that use external dependencies.